Abstract

In this paper, an efficient modified Newton type algorithm is proposed for nonlinear unconstrained
optimization problems. The modified Hessian is a convex combination of the identity matrix (for the
steepest descent algorithm) and the Hessian matrix (for the Newton algorithm). The coefficients of
the convex combination are chosen dynamically in every iteration. The algorithm is proved to be
globally and quadratically convergent for (convex and nonconvex) nonlinear functions. An efficient
implementation is described. Numerical tests on widely used CUTE test problems are conducted for
the new algorithm. The test results are compared with those obtained by the MATLAB optimization
toolbox function fminunc. The test results are also compared with those obtained by some established
and state-of-the-art algorithms, such as a limited memory BFGS, a descent and conjugate gradient
algorithm, and a limited memory and descent conjugate gradient algorithm. The comparisons show
that the new algorithm is promising.
1 Introduction



Newton type algorithms are attractive due to their fast convergence rate [2]. In the nonconvex case,
the Newton algorithm may not be globally convergent; therefore, various modified Newton algorithms
have been proposed, for example [5], [10]. The idea is to add a positive diagonal matrix to the Hessian
matrix so that the modified Hessian is positive definite and the modified algorithms become globally
convergent, which is similar to the idea of the Levenberg-Marquardt method studied in [52]. However,
for iterates far away from the solution set, the added diagonal matrix may be very large. This may lead
to a poor condition number of the modified Hessian, generate a very small step, and prevent the iterates
from moving quickly to the solution set [7].

In this paper, we propose a slightly different modified Newton algorithm. The modified Hessian is a
convex combination of the Hessian (for the Newton algorithm) and the identity matrix (for the steepest
descent algorithm). Therefore, the condition number of the modified Hessian is well controlled, and the
steepest descent algorithm and the Newton algorithm are special cases of the proposed algorithm. We
will show that the proposed algorithm has the merits of both the steepest descent algorithm and the
Newton algorithm, i.e., the algorithm is globally and quadratically convergent. We will also show that
the algorithm can be implemented efficiently, using the optimization techniques on Riemannian manifolds proposed in
[S] , [IH] , [IS] , and . Numerical test for the new algorithm is conducted for the widely used nonlinear 
optimization test problem set CUTE downloaded from [1]. The test results are compared with those
obtained by the MATLAB optimization toolbox function fminunc. The test results are also compared with
those obtained by some established and state-of-the-art algorithms, such as limited memory BFGS [17],
a descent and conjugate gradient algorithm [14], and a limited memory and descent conjugate gradient
algorithm [13]. The comparison shows that the new algorithm is promising.

The rest of the paper is organized as follows. Section 2 proposes the modified Newton algorithm and
provides the convergence results. Section 3 discusses an efficient implementation involving calculations
of the maximum and minimum eigenvalues of the Hessian matrix. Section 4 presents numerical test
results. The last section summarizes the main result of this paper.

2 Modified Newton Method 

Our objective is to minimize a multivariable nonlinear (convex or nonconvex) function

$$\min f(x), \tag{1}$$

where $f$ is twice differentiable. Throughout the paper, we denote by $g(x)$, or simply $g$, the gradient
of $f(x)$; by $H(x)$, or simply $H$, the Hessian of $f(x)$; by $\lambda_{\max}(H(x))$, or simply
$\lambda_{\max}(H)$, the maximum eigenvalue of $H(x)$; and by $\lambda_{\min}(H(x))$, or simply
$\lambda_{\min}(H)$, the minimum eigenvalue of $H(x)$. Assuming that $\bar{x}$ is a local minimizer,
we make the following assumptions in our convergence analysis.

Assumptions: 

1. $g(\bar{x}) = 0$.

2. The gradient $g(x)$ is Lipschitz continuous, i.e., there exists a constant $L > 0$ such that for any
$x$ and $y$,

$$\|g(x) - g(y)\| \le L \|x - y\|. \tag{2}$$

3. There are small positive numbers $\delta > 0$ and $\eta > 0$, a large positive number $\Delta \ge 1$,
and a neighborhood of $\bar{x}$, defined by $\mathcal{N}(\bar{x}) = \{x : \|g(x) - g(\bar{x})\| \le \eta\}$,
such that for all $x \in \mathcal{N}(\bar{x})$,

$$\lambda_{\min}(H(x)) \ge \delta > 0 \quad \text{and} \quad \lambda_{\max}(H)/\lambda_{\min}(H) \le \Delta.$$

Assumption 1 is standard, i.e., $\bar{x}$ meets the first order necessary condition. If the gradient is
Lipschitz continuous as defined in Assumption 2, then $\mathcal{N}(\bar{x})$ is well defined. Assumption 3
indicates that for all $x \in \mathcal{N}(\bar{x})$, a strong second order sufficient condition holds, and
the condition number of the Hessian is bounded, which is equivalent to $\lambda_{\max}(H) < \infty$ given
$\lambda_{\min}(H(x)) \ge \delta$.
In the remaining discussion, we use the subscript $k$ for the $k$th iteration. The idea of the proposed
algorithm is to search for minimizers along a direction $d_k$ that satisfies

$$(\gamma_k I + (1 - \gamma_k) H(x_k)) d_k = B_k d_k = -g(x_k), \tag{3}$$

where $\gamma_k \in [0, 1]$ is carefully selected in every iteration. Clearly, the modified Hessian $B_k$ is a
convex combination of the identity matrix (for the steepest descent algorithm) and the Hessian (for the
Newton algorithm). When $\gamma_k = 1$, the algorithm reduces to the steepest descent algorithm; when
$\gamma_k = 0$, it reduces to the Newton algorithm. We focus on the selection of $\gamma_k$ and prove the
global and quadratic convergence of the proposed algorithm. The convergence properties are directly
related to the quality of the search direction and the step length, which in turn determine the selection
criteria for $\gamma_k$. The quality of the search direction is measured by

$$\cos(\theta_k) = -\frac{g_k^T d_k}{\|g_k\| \|d_k\|}, \tag{4}$$

which should be bounded away from zero in all iterations. A good step length $\alpha_k$ should satisfy the
following Wolfe conditions:

$$f(x_k + \alpha_k d_k) \le f(x_k) + \sigma_1 \alpha_k g_k^T d_k, \tag{5a}$$

$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma_2 g_k^T d_k, \tag{5b}$$

where $0 < \sigma_1 < \sigma_2 < 1$. The existence of a step length satisfying the Wolfe conditions is
established in [501 HI]. The proposed algorithm is given as follows.

Algorithm 2.1 Modified Newton

Data: $0 < \delta$, $1 \le \Delta < \infty$, and initial $x_0$.
for $k = 0, 1, 2, \ldots$

    Calculate gradient $g(x_k)$.

    Calculate Hessian $H(x_k)$, select $\gamma_k$, and calculate $d_k$ from (3).
    Select $\alpha_k$ and set $x_{k+1} = x_k + \alpha_k d_k$.

end

Remark 2.1 An algorithm that finds, in finitely many steps, a point satisfying the Wolfe conditions is given
in [16]. Therefore, the selection of $\alpha_k$ will not be discussed in this paper.
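
The Wolfe conditions (5a) and (5b) are easy to check for a candidate step length. The following is a minimal Python sketch, not the line search of [16]; the function name `satisfies_wolfe` and the default values of `sigma1` and `sigma2` are assumptions made here for illustration.

```python
def satisfies_wolfe(f, grad, x, d, alpha, sigma1=1e-4, sigma2=0.9):
    """Return True if alpha satisfies the Wolfe conditions (5a) and (5b).

    f and grad are callables on lists of floats; d is a descent direction.
    sigma1 and sigma2 are illustrative defaults with 0 < sigma1 < sigma2 < 1.
    """
    g = grad(x)
    gd = sum(gi * di for gi, di in zip(g, d))            # g_k^T d_k
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    g_new = grad(x_new)
    gd_new = sum(gi * di for gi, di in zip(g_new, d))    # g(x_k + alpha d_k)^T d_k
    sufficient_decrease = f(x_new) <= f(x) + sigma1 * alpha * gd   # (5a)
    curvature = gd_new >= sigma2 * gd                               # (5b)
    return sufficient_decrease and curvature


# One-dimensional example: f(x) = x^2 with the steepest descent direction at x = 1.
quad = lambda x: x[0] ** 2
quad_grad = lambda x: [2.0 * x[0]]
print(satisfies_wolfe(quad, quad_grad, [1.0], [-2.0], 0.5))
```

In this example a step that is too short fails (5b) and a step that is too long fails (5a), which is the behavior a line search has to balance.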

We will use an important global convergence result given by Zoutendijk [5S] which can be stated as 
follows. 

Theorem 2.1 Suppose that $f$ is bounded below in $\mathbb{R}^n$ and that $f$ is twice continuously
differentiable in a neighborhood $\mathcal{M}$ of the level set $\mathcal{L} = \{x : f(x) \le f(x_0)\}$.
Assume that the gradient is Lipschitz continuous for all $x, y \in \mathcal{M}$. Assume further that $d_k$
is a descent direction and $\alpha_k$ satisfies the Wolfe conditions. Then

$$\sum_{k \ge 0} \cos^2(\theta_k) \|g_k\|^2 < \infty. \tag{6}$$



Zoutendijk's theorem indicates that if, for all $k \ge 0$, $d_k$ is a descent direction and
$\cos(\theta_k) \ge C > 0$ for a constant $C$, then the algorithm is globally convergent because
$\lim_{k \to \infty} \|g_k\| = 0$. To ensure that $d_k$ is a descent direction, $B_k$ should be positive
definite. This can be achieved by setting

$$\gamma_k + (1 - \gamma_k) \lambda_{\min}(H_k) \ge \delta, \tag{7}$$
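which is equivalent to

$$\gamma_k (1 - \lambda_{\min}(H_k)) \ge \delta - \lambda_{\min}(H_k). \tag{8}$$

Therefore, we set

$$\gamma_k = \begin{cases} 0 & \text{if } \lambda_{\min}(H_k) \ge \delta, \\ \dfrac{\delta - \lambda_{\min}(H_k)}{1 - \lambda_{\min}(H_k)} & \text{if } \lambda_{\min}(H_k) < \delta. \end{cases} \tag{9}$$

In view of (3) and (4), it is clear that if

$$\|B_k\| \|B_k^{-1}\| \le \Delta, \tag{10}$$

where $1 \le \Delta < \infty$, then $\cos(\theta_k) \ge 1/\Delta = C > 0$. Therefore, in view of Theorem 2.1,
to achieve global convergence, from (3) and (10), the following condition should be met:

$$\frac{\gamma_k + (1 - \gamma_k)\lambda_{\max}(H_k)}{\gamma_k + (1 - \gamma_k)\lambda_{\min}(H_k)} \le \Delta. \tag{11}$$

Using (11) and $\gamma_k + (1 - \gamma_k)\lambda_{\min}(H_k) > 0$, we have

$$(\Delta - 1 + \lambda_{\max}(H_k) - \lambda_{\min}(H_k)\Delta)\gamma_k \ge \lambda_{\max}(H_k) - \lambda_{\min}(H_k)\Delta. \tag{12}$$

Since $\Delta - 1 + \lambda_{\max}(H_k) - \lambda_{\min}(H_k)\Delta \ge \lambda_{\max}(H_k) - \lambda_{\min}(H_k)\Delta$, we should select

$$\gamma_k = \begin{cases} 0 & \text{if } \lambda_{\max}(H_k) \le \lambda_{\min}(H_k)\Delta, \\ \dfrac{\lambda_{\max}(H_k) - \lambda_{\min}(H_k)\Delta}{\Delta - 1 + \lambda_{\max}(H_k) - \lambda_{\min}(H_k)\Delta} & \text{if } \lambda_{\max}(H_k) > \lambda_{\min}(H_k)\Delta. \end{cases} \tag{13}$$

Combining (9) and (13) yields

$$\gamma_k = \begin{cases} 0 & \text{if } \lambda_{\min}(H_k) \ge \delta \text{ and } \lambda_{\max}(H_k) \le \lambda_{\min}(H_k)\Delta, \\ a_k = \dfrac{\delta - \lambda_{\min}(H_k)}{1 - \lambda_{\min}(H_k)} & \text{if } \lambda_{\min}(H_k) < \delta \text{ and } \lambda_{\max}(H_k) \le \lambda_{\min}(H_k)\Delta, \\ b_k = \dfrac{\lambda_{\max}(H_k) - \lambda_{\min}(H_k)\Delta}{\Delta - 1 + \lambda_{\max}(H_k) - \lambda_{\min}(H_k)\Delta} & \text{if } \lambda_{\min}(H_k) \ge \delta \text{ and } \lambda_{\max}(H_k) > \lambda_{\min}(H_k)\Delta, \\ \max\{a_k, b_k\} & \text{otherwise.} \end{cases} \tag{14}$$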

It is clear from the selection of $\gamma_k$ that (7) and (12) hold. This means that the conditions of
Theorem 2.1 hold. Therefore, Algorithm 2.1 is globally convergent.
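
The case analysis for $\gamma_k$ can be sketched in a few lines of Python. This is an illustrative sketch, assuming $0 < \delta < 1$ and $\Delta \ge 1$; the function name `select_gamma` and the numeric values below are hypothetical and not taken from the paper. Setting $a_k$ or $b_k$ to zero when its condition is inactive reproduces all four cases through a single maximum.

```python
def select_gamma(lam_min, lam_max, delta, Delta):
    """Select gamma_k from the extreme Hessian eigenvalues.

    a enforces the lower eigenvalue bound (7); b enforces the condition
    number bound (11). Assumes 0 < delta < 1 and Delta >= 1.
    """
    a = (delta - lam_min) / (1.0 - lam_min) if lam_min < delta else 0.0
    spread = lam_max - lam_min * Delta
    b = spread / (Delta - 1.0 + spread) if spread > 0.0 else 0.0
    return max(a, b)


def modified_extremes(lam_min, lam_max, gamma):
    """Extreme eigenvalues of B = gamma I + (1 - gamma) H."""
    return gamma + (1.0 - gamma) * lam_min, gamma + (1.0 - gamma) * lam_max


# Indefinite Hessian: lambda_min = -2, lambda_max = 10 (hypothetical values).
gamma = select_gamma(-2.0, 10.0, delta=1e-3, Delta=100.0)
b_min, b_max = modified_extremes(-2.0, 10.0, gamma)
print(gamma, b_min, b_max / b_min)
```

For a well-conditioned positive definite Hessian the function returns zero, so the direction is the pure Newton direction.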

Since Algorithm 2.1 is globally convergent in the sense that $\lim_{k \to \infty} \|g_k\| = 0$, there exists an
$\eta > 0$ such that for $k$ large enough, $\|g(x_k)\| \le \eta$; from Assumption 3, $\lambda_{\min}(H(x_k)) \ge \delta > 0$
and $\lambda_{\max}(H_k)/\lambda_{\min}(H_k) \le \Delta$. From (14), $\gamma_k = 0$ for all $k$ large enough, i.e.,
Algorithm 2.1 reduces to the Newton algorithm. Therefore, the proposed algorithm is quadratically
convergent. We summarize the main result of this paper as the following theorem.

Theorem 2.2 Suppose that $f$ is bounded below in $\mathbb{R}^n$ and that $f$ is twice continuously
differentiable in a neighborhood $\mathcal{M}$ of the level set $\mathcal{L} = \{x : f(x) \le f(x_0)\}$.
Assume that the gradient is Lipschitz continuous for all $x, y \in \mathcal{M}$. Assume further that $d_k$
is defined as in (3) with $\gamma_k$ selected as in (14), and that $\alpha_k$ satisfies the Wolfe conditions.
Then Algorithm 2.1 is globally convergent. Moreover, if the limit point $\bar{x}$ satisfies Assumption 3,
then Algorithm 2.1 converges to $\bar{x}$ at a quadratic rate.

Remark 2.2 Since $B_k$ is positive definite, its Cholesky factorization exists, and (3) can be solved
efficiently. Furthermore, if $H_k$ is sparse, (3) can be solved using sparse-matrix techniques.
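
As an illustration of Remark 2.2, the following Python sketch solves $B_k d_k = -g_k$ by a dense Cholesky factorization followed by forward and backward substitution. It is a sketch in plain Python for small matrices, not the paper's Matlab implementation; the function names `cholesky` and `solve_direction` are assumptions made here.

```python
import math


def cholesky(B):
    """Return lower-triangular L with B = L L^T; raise if B is not positive definite."""
    n = len(B)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                pivot = B[i][i] - s
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(pivot)
            else:
                L[i][j] = (B[i][j] - s) / L[j][j]
    return L


def solve_direction(H, g, gamma):
    """Solve (gamma I + (1 - gamma) H) d = -g using a Cholesky factorization."""
    n = len(g)
    B = [[(1.0 - gamma) * H[i][j] + (gamma if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(B)
    y = [0.0] * n                      # forward substitution: L y = -g
    for i in range(n):
        y[i] = (-g[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    d = [0.0] * n                      # backward substitution: L^T d = y
    for i in reversed(range(n)):
        d[i] = (y[i] - sum(L[k][i] * d[k] for k in range(i + 1, n))) / L[i][i]
    return d


# gamma = 0 gives the Newton direction for a positive definite Hessian.
print(solve_direction([[4.0, 2.0], [2.0, 3.0]], [2.0, 1.0], 0.0))
```

With `gamma = 1` the same routine returns the steepest descent direction, since $B_k$ is then the identity matrix.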

3 Implementation Consideration 

To implement Algorithm 2.1 for practical use, we need to consider several issues.

3.1 Termination

First, we need a termination rule in Algorithm 2.1. This rule is checked at the end of Step 1. For
$0 < \epsilon$, if $\|g(x_k)\| < \epsilon$ or $\|g(x_k)\|_\infty < \epsilon$, stop.
3.2 Computation of extreme eigenvalues

The most significant computation in the proposed algorithm is the selection of $\gamma_k$, which involves the
computation of $\lambda_{\max}(H_k)$ and $\lambda_{\min}(H_k)$ for the symmetric matrix $H_k$. There are general
algorithms to compute all eigenvalues and eigenvectors of a symmetric matrix. However, there are much more
efficient algorithms for the extreme eigenvalues of a symmetric matrix, which is equivalent to finding the
extrema of the Rayleigh quotient

$$\lambda_{\max}(H_k) = \max_{\|x\| = 1} x^T H_k x = \max_{x \ne 0} \frac{x^T H_k x}{x^T x}, \tag{15}$$

$$\lambda_{\min}(H_k) = \min_{\|x\| = 1} x^T H_k x = \min_{x \ne 0} \frac{x^T H_k x}{x^T x}. \tag{16}$$

It is well known that there are cubically convergent algorithms for the Rayleigh quotient problem
[S]. In our opinion, the most efficient methods are the conjugate gradient optimization algorithm on
Riemannian manifolds proposed by Smith [18] and the Armijo-Newton optimization algorithm on
Riemannian manifolds proposed by Yang [23], [24]. Both methods make full use of the geometry of the
unit sphere ($\|x\| = 1$) and search for the solution along arcs defined by geodesics on the unit sphere.
The Armijo-Newton algorithm may converge faster, but it may converge to an interior eigenvalue rather than
an extreme eigenvalue. The conjugate gradient optimization algorithm may also converge to an interior
eigenvalue, but the chance is much smaller, and a small perturbation may lead the iterates to converge
to the desired extreme eigenvalue. Let $x$ be on the unit sphere and $\rho(x) = x^T H x$. For a vector $v$ in
the tangent space at $x$, let $\tau v$ denote the parallel translation of $v$ along the geodesic defined by a
unit-length vector $q$ in the tangent space at $x$. It is shown in [18] that

$$\tau v = v - (v^T q)(x \sin(t) + q(1 - \cos(t))). \tag{17}$$

To find the maximum eigenvalue of $H$ defined in (15), the conjugate gradient algorithm proposed in [18]
is stated as follows (with very minor but important modifications, which are explained in Remark 3.1).

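Algorithm 3.1 Conjugate gradient (CG) for maximum eigenvalue

Data: $0 < \epsilon$, initial $x_0$ with $\|x_0\| = 1$, $G_0 = Q_0 = (H - \rho(x_0) I) x_0$.
for $k = 0, 1, 2, \ldots$

    Calculate $c$, $s$, and $v = 1 - c$, such that $\rho(x_k c + q_k s)$ is maximized, where $c^2 + s^2 = 1$
    and $q_k = Q_k / \|Q_k\|$. This can be accomplished by geodesic maximization, and the formula is given by

    $$c = \sqrt{\tfrac{1}{2}\left(1 + \tfrac{b}{r}\right)}, \quad s = \frac{a}{2rc}, \quad \text{if } b \ge 0; \qquad s = \sqrt{\tfrac{1}{2}\left(1 - \tfrac{b}{r}\right)}, \quad c = \frac{a}{2rs}, \quad \text{if } b \le 0, \tag{18}$$

    where $a = 2 x_k^T H q_k$, $b = x_k^T H x_k - q_k^T H q_k$, and $r = \sqrt{a^2 + b^2}$.

    Set $x_{k+1} = x_k c + q_k s$, $x_{k+1} = x_{k+1} / \|x_{k+1}\|$, $\tau Q_k = Q_k c - x_k \|Q_k\| s$, and
    $\tau G_k = G_k - (q_k^T G_k)(x_k s + q_k v)$.

    Set $G_{k+1} = (H - \rho(x_{k+1}) I) x_{k+1}$, $Q_{k+1} = G_{k+1} + \mu_k \tau Q_k$, where
    $\mu_k = \dfrac{(G_{k+1} - \tau G_k)^T G_{k+1}}{G_k^T Q_k}$.

    Set $Q_{k+1} = (I - x_{k+1} x_{k+1}^T) Q_{k+1}$.

    If $k = n - 1$, set $Q_{k+1} = (I - x_{k+1} x_{k+1}^T) G_{k+1}$.

end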



Remark 3.1 $Q_{k+1}$ should be in the tangent space at $x_{k+1}$, but numerical error may perturb $Q_{k+1}$
slightly. Therefore, the projection is necessary to bring $Q_{k+1}$ back to the tangent space at $x_{k+1}$.
A similar change is made to ensure the unit length of $x_k$. With these minor changes, the CG algorithm is
much more stable, and the observed convergence rate is faster than the one reported in [18].
Remark 3.2 To search for the minimum eigenvalue in (16), (18) is replaced by the analogous formula,
which is obtained by minimizing $\rho(x_k c + q_k s)$ under the constraint $c^2 + s^2 = 1$.
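
The geodesic search has a closed form because, with $c = \cos t$ and $s = \sin t$, $\rho(x c + q s)$ equals a constant plus $\tfrac{b}{2}\cos 2t + \tfrac{a}{2}\sin 2t$. The Python sketch below implements that closed form for the maximum; for the minimum it uses the observation that minimizing $\rho$ is the same as maximizing $-\rho$, which flips the signs of $a$ and $b$. That minimizing variant is a derivation made here and is not copied from the paper; the function names are assumptions.

```python
import math


def geodesic_extreme(a, b, maximize=True):
    """Return (c, s) with c^2 + s^2 = 1 that extremizes rho(x c + q s).

    a = 2 x^T H q and b = x^T H x - q^T H q for a unit vector x and a unit
    tangent vector q. The minimizing branch is derived by a sign flip.
    """
    if not maximize:
        a, b = -a, -b          # minimizing rho is maximizing -rho
    r = math.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0        # rho is constant along the geodesic
    if b >= 0.0:
        c = math.sqrt(0.5 * (1.0 + b / r))
        s = a / (2.0 * r * c)
    else:
        s = math.sqrt(0.5 * (1.0 - b / r))
        c = a / (2.0 * r * s)
    return c, s


def rho_on_geodesic(H, x, q, c, s):
    """Evaluate rho(y) = y^T H y at y = x c + q s."""
    y = [c * xi + s * qi for xi, qi in zip(x, q)]
    Hy = [sum(hij * yj for hij, yj in zip(row, y)) for row in H]
    return sum(yi * hyi for yi, hyi in zip(y, Hy))


# On the unit circle one geodesic covers the whole sphere, so a single search
# returns the extreme eigenvalues of a 2-by-2 matrix.
H = [[2.0, 1.0], [1.0, 3.0]]
x, q = [1.0, 0.0], [0.0, 1.0]
a, b = 2.0 * H[0][1], H[0][0] - H[1][1]
print(rho_on_geodesic(H, x, q, *geodesic_extreme(a, b)))
print(rho_on_geodesic(H, x, q, *geodesic_extreme(a, b, maximize=False)))
```

The eigenvalues of this small matrix are $(5 \pm \sqrt{5})/2$, which gives a simple check of both branches.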

Remark 3.3 Each iteration of Algorithm 3.1 involves only matrix-vector multiplications, so the cost per
iteration, $O(n^2)$, is very low. Our experience shows that only a few iterations are needed to converge to
the extreme eigenvalues.

Remark 3.4 If $H$ is sparse, Algorithm 3.1 will be very efficient.

3.3 The implemented modified Newton algorithm 

The implemented modified Newton algorithm is as follows.

Algorithm 3.2 Modified Newton

Data: $0 < \delta$, $0 < \epsilon$, $1 \le \Delta < \infty$, and initial $x_0$ with $\|x_0\| = 1$.
for $k = 0, 1, 2, \ldots$

    Calculate gradient $g(x_k)$.

    If $\|g(x_k)\| < \epsilon$ or $\|g(x_k)\|_\infty < \epsilon$, stop.

    Calculate Hessian $H(x_k)$.

    Calculate $\lambda_{\max}(H_k)$ and $\lambda_{\min}(H_k)$ using Algorithm 3.1.
    Select $\gamma_k$ using (14), and calculate $d_k$ using (3).

    If $d_k$ is not a descent direction, Algorithm 3.1 has generated an interior eigenvalue. In this case, a
    conventional method is used to find $\lambda_{\max}(H_k)$ and $\lambda_{\min}(H_k)$. Then select
    $\gamma_k$ using (14), and calculate $d_k$ using (3).

    Select $\alpha_k$ using a one-dimensional search and set $x_{k+1} = x_k + \alpha_k d_k$.

end

Remark 3.5 It is very rare that a conventional method has to be used to calculate $\lambda_{\max}(H_k)$ and
$\lambda_{\min}(H_k)$. But this safeguard is needed in case Algorithm 3.1 generates an interior eigenvalue.

4 Numerical Test 

In this section, we present some test results for both Algorithm 3.1 and Algorithm 3.2.

4.1 Test of Algorithm 3.1

The advantages of Algorithm 3.1 have been explained in [B]. We conducted numerical tests on some
problems to confirm the theoretical analysis. For the sake of comparison, we use an example in [22]
because it provides detailed information about the test problem and the results obtained by many other
algorithms. For this problem,

$$H = \begin{pmatrix} r_0 & r_1 & \cdots & r_{15} \\ r_1 & r_0 & \cdots & r_{14} \\ \vdots & \vdots & \ddots & \vdots \\ r_{15} & r_{14} & \cdots & r_0 \end{pmatrix},$$
where $r_0 = 1.00000000$, $r_1 = 0.91189350$, $r_2 = 0.75982820$, $r_3 = 0.59792770$, $r_4 = 0.41953610$,
$r_5 = 0.27267350$, $r_6 = 0.13446390$, $r_7 = 0.00821722$, $r_8 = -0.09794101$, $r_9 = -0.21197350$,
$r_{10} = -0.30446960$, $r_{11} = -0.34471370$, $r_{12} = -0.34736840$, $r_{13} = -0.32881280$,
$r_{14} = -0.29269750$, $r_{15} = -0.24512650$. The minimum eigenvalue is $\lambda_{\min} = 0.00325850037049$.
Four methods, namely HE, TJ, FR, and CA, which use formulae derived from [1], [S], [12], and [35], are
tested and reported in [22]. These test results are compared with the results obtained by Algorithm 3.1
(CG). The comparison is presented in Table 1. The result is clearly in favor of Algorithm 3.1 (CG).
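
For readers who want to reproduce the test matrix, the following Python sketch builds the symmetric Toeplitz matrix from the sixteen coefficients listed above and evaluates the Rayleigh quotient at a given vector. The helper names `toeplitz` and `rayleigh` are assumptions; the sketch only constructs the problem and does not reproduce the reported iteration counts.

```python
R = [1.00000000, 0.91189350, 0.75982820, 0.59792770,
     0.41953610, 0.27267350, 0.13446390, 0.00821722,
     -0.09794101, -0.21197350, -0.30446960, -0.34471370,
     -0.34736840, -0.32881280, -0.29269750, -0.24512650]


def toeplitz(r):
    """Symmetric Toeplitz matrix with entries H[i][j] = r[|i - j|]."""
    n = len(r)
    return [[r[abs(i - j)] for j in range(n)] for i in range(n)]


def rayleigh(H, x):
    """Rayleigh quotient x^T H x / x^T x for a nonzero vector x."""
    Hx = [sum(hij * xj for hij, xj in zip(row, x)) for row in H]
    return sum(xi * hxi for xi, hxi in zip(x, Hx)) / sum(xi * xi for xi in x)


H16 = toeplitz(R)
e1 = [1.0] + [0.0] * 15                                       # (1, 0, ..., 0)^T
alternating = [(-1.0) ** (i + 1) for i in range(16)]          # (-1, 1, -1, ...)^T
print(rayleigh(H16, e1), rayleigh(H16, alternating))
```

Any Rayleigh quotient lies between the minimum and maximum eigenvalues, so these two values bracket nothing by themselves but give starting values of $\rho$ for the two initial vectors used in Table 1.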

Table 1: Simulation results of 5 algorithms for the test problem

          (-1,1,-1,...)^T            (1,0,...,0)^T
Algo      iter    lambda_min         iter    lambda_min
HE        24      0.0032585          77      0.0032586
TJ        26      0.0032585          65      0.0032586
FR        17      0.0032585          87      0.0032586
CA        32      0.0032585          124     0.0032586
CG        10      0.0032585          14      0.0032585

4.2 Test of Algorithm 3.2 on the Rosenbrock function

Algorithm 3.2 is implemented in the Matlab function mNewton. The following parameters are chosen: $\delta$ =
10~*, $\Delta$ = 10^^, and $\epsilon$ = 10^^. A test of Algorithm 3.2 is done for the Rosenbrock function given by

$$f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2$$

with initial point $x_0 = [-1.9, 2.0]^T$. Steepest descent is inefficient for this problem: after 1000
iterations, the iterate is still a considerable distance from the minimum point $x^* = [1, 1]^T$. The BFGS
algorithm is significantly better: after 34 iterations, it terminates at $x = [0.9998, 0.9996]^T$ (cf. [15]).
The new algorithm performs even better: after 24 iterations, it terminates at $x = [0.9999, 0.9998]^T$.
Similar to the BFGS algorithm, the new method is able to follow the shape of the valley and converges to
the minimum, as depicted in Figure 1, where the contours of the Rosenbrock function, the gradient flow
from the initial point to the minimum point (blue line), and all iterates (red "x") are plotted.
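
The Rosenbrock test can be imitated with a compact Python sketch of the modified Newton iteration. It is a sketch under stated assumptions and not the mNewton code: the extreme eigenvalues of the 2-by-2 Hessian are computed in closed form rather than by Algorithm 3.1, an Armijo backtracking search stands in for the Wolfe line search, and the values of `delta`, `Delta`, and `eps` are hypothetical. Iteration counts from this sketch therefore need not match those reported above.

```python
import math


def f(x):
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2


def grad(x):
    return [-400.0 * x[0] * (x[1] - x[0] ** 2) - 2.0 * (1.0 - x[0]),
            200.0 * (x[1] - x[0] ** 2)]


def hess(x):
    return [[1200.0 * x[0] ** 2 - 400.0 * x[1] + 2.0, -400.0 * x[0]],
            [-400.0 * x[0], 200.0]]


def select_gamma(lam_min, lam_max, delta, Delta):
    """Convex-combination weight from the case analysis in (14)."""
    a = (delta - lam_min) / (1.0 - lam_min) if lam_min < delta else 0.0
    spread = lam_max - lam_min * Delta
    b = spread / (Delta - 1.0 + spread) if spread > 0.0 else 0.0
    return max(a, b)


def modified_newton(x0, delta=1e-6, Delta=1e8, eps=1e-8, max_iter=500):
    x = list(x0)
    for k in range(max_iter):
        g = grad(x)
        if math.hypot(g[0], g[1]) < eps:          # termination rule
            return x, k
        H = hess(x)
        mean = 0.5 * (H[0][0] + H[1][1])          # closed-form 2x2 eigenvalues
        rad = math.hypot(0.5 * (H[0][0] - H[1][1]), H[0][1])
        gamma = select_gamma(mean - rad, mean + rad, delta, Delta)
        b00 = gamma + (1.0 - gamma) * H[0][0]     # B = gamma I + (1 - gamma) H
        b01 = (1.0 - gamma) * H[0][1]
        b11 = gamma + (1.0 - gamma) * H[1][1]
        det = b00 * b11 - b01 * b01
        d = [(-g[0] * b11 + g[1] * b01) / det,    # solve B d = -g (Cramer's rule)
             (g[0] * b01 - g[1] * b00) / det]
        gd = g[0] * d[0] + g[1] * d[1]
        fx, alpha = f(x), 1.0
        for _ in range(60):                       # Armijo backtracking (assumed)
            trial = [x[0] + alpha * d[0], x[1] + alpha * d[1]]
            if f(trial) <= fx + 1e-4 * alpha * gd:
                break
            alpha *= 0.5
        x = trial
    return x, max_iter


x_final, iterations = modified_newton([-1.9, 2.0])
print(x_final, iterations)
```

When the Hessian is positive definite and well conditioned, `select_gamma` returns zero and each step is a damped Newton step, which is why the iterates can follow the curved valley.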




Figure 1: The new algorithm follows the shape of the valley of the Rosenbrock function.

4.3 Test of Algorithm 3.2 on CUTE problems

We also tested both mNewton and the Matlab optimization toolbox function fminunc on the
CUTE test problem set. The fminunc options are set as

options = optimset('MaxFunEvals',1e+20,'MaxIter',5e+5,'TolFun',1e-20,'TolX',1e-10).
This setting is selected to ensure that the Matlab function fminunc has enough iterations to converge
or to fail. The CUTE test problem set is downloaded from the Princeton test problem collections [1]. Since
the CUTE test set is presented in AMPL mod-files, we first convert the AMPL mod-files into nl-files so that
Matlab functions can read the CUTE models; then we use the Matlab functions mNewton and fminunc to read
the nl-files and solve these test problems. Because the conversion software which converts mod-files to
nl-files is restricted to problems whose sizes are smaller than 300, the test is done for all CUTE
unconstrained optimization problems whose sizes are less than 300. The test uses the initial points
provided by the CUTE test problem set. We record the calculated objective function values, the norms of
the gradients at the final points, and the iteration numbers for these test problems. We present the test
results in Table 2, and summarize the comparison of the test results as follows:

1. the modified Newton function mNewton converges in all the test problems once the termination condition
$\|g(x_k)\| < 10^{-5}$ is met. But for about 40% of the problems, the Matlab optimization toolbox function
fminunc does not reduce $\|g(x_k)\|$ to a value smaller than 0.01. For these problems, the objective
function values obtained by fminunc are normally not close to the minimum;

2. for problems on which both mNewton and fminunc converge, mNewton normally uses fewer iterations than
fminunc and converges to points with smaller $\|g(x_k)\|$, except for the two problems bard and deconvu.



Table 2: Test results for problems in CUTE [3]; initial points are given in CUTE

          mNewton                                fminunc
Problem   iter   obj             gradient        iter   obj             gradient
arglina   1      100.000000                             100.000000
bard
beale     6      0.00000000e-9   0.38908644e-9   15     0.00000024e-5   1.3929429e-5
brkmcc    2      0.16904267      0.61053106e-5   5      0.16904268      0.0454266e-5
brownal   7      0.00000000e-7   0.26143011e-7   16     0.00030509e-5   0.00010437
brownbs   8      0.00000000e-9   0.00000000e-9   11     0.00009308      15798.5950
brownden  8      85822.2016      0.00003000e-5   32     85822.2017      0.46462733
chnrosnb  46     0.00000000e-5   0.10455150e-5   98     30.0583699      10.1863739
cliff     26     0.19978661      0.10751025e-6   1      1.00159994      1.41477930
cube      28     0.00000000e-9   0.69055669e-9   34     0.79877450e-9   0.00013409
deconvu   2612   0.00242309e-5   0.99584075e-5   80     0.00031582e-3   0.1750297e-3
denschna  5      0.00000022e-5   0.29676520e-5   10     0.00000005e-5   0.1581909e-5
denschnb  5      0.00000004e-5   0.17646764e-5   7      0.00000010e-5   0.2200204e-5
denschnc  11     0.00000000e-9   0.17803850e-9   21     0.00000160e-3   0.3262483e-3
denschnd  36     0.00126578e-5   0.77956675e-5   23     45.2971677      84.5851141
denschnf  6      0.00000000e-9   0.62887898e-9   10     0.00000002e-3   0.1005028e-3
dixon3dq  1      0.00000000e-7   0.00000000e-7   20     0.00000014e-5   0.3661452e-5
eigenals  22     0.00000000e-7   0.45589372e-7   78     0.10928398e-2   0.10292633
eigenbls  62     0.00000000e-6   0.32395333e-6   91     0.34624147      0.46420894
engval2   13     0.00000001e-5   0.36978724e-7   29     0.00003953e-5   0.2799583e-3
extrosnb  1      0.00000000      0.00000000      1      0.00000000      0.00000000
fletcbv2  1      -0.5140067      0.50699056e-5   98     -0.5140067      0.1087190e-4
fletchcr  12     0.00000000e-7   0.12606909e-7   63     68.128920       160.987949
genhumps  52     0.00000003e-7   0.29148635e-7   59     0.00044932e-3   0.3167733e-3
hairy     19     20.0000000      0.00065611e-5   22     20.000000000    0.3810773e-4
heart6ls  375    0.00000000e-5   0.29136580e-5   53     0.63188192      71.9382548
helix     13     0.00000000e-9   0.31818245e-9   29     0.00000226e-5   0.4196860e-4
hilberta  1      0.00001538e-7   0.92172479e-7   35     0.02289322e-5   0.3263435e-5
hilbertb  1      0.0000004e-20   0.1267079e-12   6      0.00000021e-5   0.6542441e-5
himmelbb  7      0.0000001e-13   0.13251887e-6   6      0.00001462      0.0012511
himmelbh  4      -1.0000000      0.00108475e-6   7      -0.9999999      0.2607156e-6
humps     26     0.00000379e-6   0.27563083e-5   25     5.42481702      2.36255440
jensmp    9      124.362182      0.00480283e-5   16     124.362182      0.2897049e-5
kowosb    10     0.30750561e-3   0.31930835e-5   33     0.30750560e-3   0.0125375e-5
loghairy  23     0.18232155      0.00103147e-5   11     2.5199616136    0.0053770
mancino   4      0.00000000e-5   0.26436736e-5   9      0.00220471      1.22432874
maratosb  7      -1.0000000      0.09342000e-9   2      -0.9997167      0.03570911
mexhat    4      -0.0401000      0.00000000e-5   4      -0.0400999      0.1370395e-4
palmer1c  6      0.09759802      0.46161602e-5   38     16139.4418      655.015973
palmer2c  1      0.01442139      0.00107794e-5   60     98.0867115      33.4524366
palmer3c  1      0.01953763      0.00434478e-6   56     54.3139592      7.85183915
palmer4c  1      0.05031069      0.01265948e-6   56     62.2623173      6.67991745
palmer5c  1      2.12808666      0.00000001e-5   14     2.12808668      0.00074844
palmer6c  1      0.01638742      0.00008202e-5   43     18.0992853      0.78517164
palmer7c  1      0.60198567      0.00120838e-5   28     56.9098797      4.02685779
palmer8c  1      0.15976806      0.00013200e-5   49     22.4365812      1.31472249
powellsq
rosenbr   20     0.00000002e-5   0.10228263e-5   36     0.00000283e-5   2.6095725e-5
sineval   41     0.00000000e-8   0.24394083e-8   47     0.22121569      1.23159435
sisser    13     0.00097741e-5   0.51113540e-5   11     0.15409254e-7   0.7282671e-5
tointqor  1      1175.47222214   0.00000000000   40     1175.4722221    0.0090419e-5
vardim    19     0.00000000e-8   0.00991963e-8   1      0.22445009e-6   0.55115494
watson    13     0.15239635e-6   0.03339433e-6   90     0.00105098      0.48756107
yfitu     36     0.00000066e-6   0.10418764e-6   57     0.00439883      11.8427717

4.4 Comparison of Algorithm 3.2 to established and state-of-the-art algorithms

Most of the above problems are also used, for example in [12], to test some established and
state-of-the-art algorithms. In [12], 145 CUTEr unconstrained problems are tested against the limited memory
BFGS algorithm [17] (implemented as L-BFGS), a descent and conjugate gradient algorithm [14] (implemented
as CG-Descent 5.3), and a limited memory descent and conjugate gradient algorithm [13] (implemented
as L-CG-Descent). The sizes of most of these test problems are smaller than or equal to 300. The size of
the largest test problems in [12] is 10000. Since our AMPL conversion software does not work for problems
whose sizes are larger than 300, we compare only problems whose sizes are less than or equal to 300. The
test results obtained by the algorithms described in [17], [14], [13] are reported in [12]. In this test, we
changed the stopping criterion for Algorithm 3.2 to $\|g(x)\|_\infty \le 10^{-6}$ for consistency. The test
results are listed in Table 3.



Table 3: Comparison of mNewton, L-CG-Descent, L-BFGS, and CG-Descent 5.3 for problems in CUTE [3];
initial points are given in CUTE

Problem   size  method          iter    obj          gradient
arglina   200   mNewton         1       1.000e+002   3.203e-014
                L-CG-Descent    1       2.000e+002   3.384e-008
                L-BFGS          1       2.000e+002   3.384e-008
                CG-Descent 5.3  1       2.000e+002   2.390e-007
bard      3     mNewton         41      1.157e-001   9.765e-007
                L-CG-Descent    16      8.215e-003   3.673e-009
                L-BFGS          16      8.215e-003   3.673e-009
                CG-Descent 5.3  21      8.215e-003   1.912e-007
beale     2     mNewton         6       4.957e-020   2.979e-010
                L-CG-Descent    15      2.727e-015   4.499e-008
                L-BFGS          15      2.727e-015   4.499e-008
                CG-Descent 5.3  18      1.497e-007   4.297e-007
brkmcc    2     mNewton         3       1.690e-001   5.640e-013
                L-CG-Descent    5       1.690e-001   6.220e-008
                L-BFGS          5       1.690e-001   6.220e-008
                CG-Descent 5.3  4       1.690e-001   5.272e-008
brownbs   2     mNewton         8       0.000e+000   0.000e+000
                L-CG-Descent    13      0.000e+000   0.000e+000
                L-BFGS          13      0.000e+000   0.000e+000
                CG-Descent 5.3  16      1.972e-031   8.882e-010
brownden  4     mNewton         8       8.582e+004   3.092e-010
                L-CG-Descent    16      8.582e+004   1.282e-007
                L-BFGS          16      8.582e+004   1.282e-007
                CG-Descent 5.3  38      8.582e+004   9.083e-007
chnrosnb  50    mNewton         46      1.885e-014   7.155e-007
                L-CG-Descent    287     6.818e-014   5.414e-007
                L-BFGS          216     1.582e-013   5.565e-007
                CG-Descent 5.3  287     6.818e-014   5.414e-007
cliff     2     mNewton         26      1.998e-001   7.602e-008
                L-CG-Descent    18      1.998e-001   2.316e-009
                L-BFGS          18      1.998e-001   2.316e-009
                CG-Descent 5.3  19      1.998e-001   6.352e-008
cube      2     mNewton         28      1.238e-017   1.985e-007
                L-CG-Descent    32      1.269e-017   1.225e-009
                L-BFGS          32      1.269e-017   1.225e-009
                CG-Descent 5.3  33      6.059e-015   4.697e-008
deconvu   61    mNewton         84016   1.567e-009   9.999e-007
                L-CG-Descent    475     1.189e-008   9.187e-007
                L-BFGS          208     2.171e-010   8.924e-007
                CG-Descent 5.3  475     1.184e-008   9.078e-007
denschna  2     mNewton         6       1.103e-023   6.642e-012
                L-CG-Descent    9       3.167e-016   3.527e-008
                L-BFGS          9       3.167e-016   3.527e-008
                CG-Descent 5.3  9       7.355e-016   4.825e-008
denschnb  2     mNewton         6       5.550e-026   4.370e-013
                L-CG-Descent    7       3.641e-017   1.034e-008
                L-BFGS          7       3.641e-017   1.034e-008
                CG-Descent 5.3  8       4.702e-014   4.131e-007
denschnc  2     mNewton         11      1.119e-021   1.731e-010
                L-CG-Descent    12      3.253e-019   3.276e-009
                L-BFGS          12      3.253e-019   3.276e-009
                CG-Descent 5.3  12      1.834e-001   4.143e-007
denschnd  3     mNewton         40      3.238e-010   9.897e-007
                L-CG-Descent    47      4.331e-010   8.483e-007
                L-BFGS          47      4.331e-010   8.483e-007
                CG-Descent 5.3  45      8.800e-009   6.115e-007
denschnf  2     mNewton         6       6.513e-022   6.281e-010
                L-CG-Descent    8       2.126e-015   6.455e-007
                L-BFGS          8       2.126e-015   6.455e-007
                CG-Descent 5.3  11      1.104e-017   6.614e-008
engval2   3     mNewton         13      2.199e-019   3.603e-008
                L-CG-Descent    26      1.034e-016   8.236e-007
                L-BFGS          26      1.034e-016   8.236e-007
                CG-Descent 5.3  76      3.185e-014   5.682e-007
hairy     2     mNewton         19      2.000e+001   1.149e-008
                L-CG-Descent    36      2.000e+001   7.961e-011
                L-BFGS          36      2.000e+001   7.961e-011
                CG-Descent 5.3  14      2.000e+001   1.044e-007
heart6ls  6     mNewton         312     1.038e-023   2.993e-008
                L-CG-Descent    684     2.646e-010   5.562e-007
                L-BFGS          684     2.646e-010   5.562e-007
                CG-Descent 5.3  2570    1.305e-010   2.421e-007
helix     3     mNewton         13      3.585e-022   3.326e-010
                L-CG-Descent    23      1.604e-015   3.135e-007
                L-BFGS          23      1.604e-015   3.135e-007
                CG-Descent 5.3  44      2.427e-013   6.444e-007
himmelbb  2     mNewton         7       7.783e-021   1.325e-007
                L-CG-Descent    10      9.294e-013   2.375e-007
                L-BFGS          10      9.294e-013   2.375e-007
                CG-Descent 5.3  11      1.584e-013   1.084e-008
himmelbh  2     mNewton         4       -1.000e+000  1.085e-009
                L-CG-Descent    7       -1.000e+000  2.892e-011
                L-BFGS          7       -1.000e+000  2.892e-011
                CG-Descent 5.3  7       -1.000e+000  1.381e-007
humps     2     mNewton         37      1.695e-013   1.826e-007
                L-CG-Descent    53      3.682e-012   8.552e-007
                L-BFGS          53      3.682e-012   8.552e-007
                CG-Descent 5.3  48      3.916e-012   8.774e-007
jensmp    2     mNewton         10      1.244e+002   2.046e-012
                L-CG-Descent    15      1.244e+002   5.302e-010
                L-BFGS          15      1.244e+002   5.302e-010
                CG-Descent 5.3  13      1.244e+002   4.206e-009
kowosb    4     mNewton         10      3.075e-004   1.055e-007
                L-CG-Descent    17      3.078e-004   3.704e-007
                L-BFGS          17      3.078e-004   3.704e-007
                CG-Descent 5.3  66      3.078e-004   8.818e-007
loghairy  2     mNewton         23      1.823e-001   1.880e-007
                L-CG-Descent    27      1.823e-001   1.762e-007
                L-BFGS          27      1.823e-001   1.762e-007
                CG-Descent 5.3  46      1.823e-001   7.562e-008
mancino   100   mNewton         5       1.257e-021   4.659e-008
                L-CG-Descent    11      9.245e-021   7.239e-008
                L-BFGS          9       3.048e-021   1.576e-007
                CG-Descent 5.3  11      9.245e-021   7.239e-008
maratosb  2     mNewton         7       -1.000e+000  9.342e-011
                L-CG-Descent    1145    -1.000e+000  3.216e-007
                L-BFGS          1145    -1.000e+000  3.216e-007
                CG-Descent 5.3  946     -1.000e+000  3.230e-009
mexhat    2     mNewton         4       -4.010e-002  1.972e-011
                L-CG-Descent    20      -4.001e-002  4.934e-009
                L-BFGS          20      -4.001e-002  4.934e-009
                CG-Descent 5.3  27      -4.001e-002  3.014e-007
palmer1c  8     mNewton         7       9.760e-002   6.619e-007
                L-CG-Descent    11      9.761e-002   1.254e-009
                L-BFGS          11      9.761e-002   1.254e-009
                CG-Descent 5.3  126827  9.761e-002   9.545e-007
palmer2c  8     mNewton         1       1.442e-002   1.023e-008
                L-CG-Descent    11      1.437e-002   1.257e-008
                L-BFGS          11      1.437e-002   1.257e-008
                CG-Descent 5.3  21362   1.437e-002   5.761e-007
palmer3c  8     mNewton         1       1.954e-002   3.958e-009
                L-CG-Descent    11      1.954e-002   1.754e-010
                L-BFGS          11      1.954e-002   1.754e-010
                CG-Descent 5.3  5536    1.954e-002   9.753e-007
palmer4c  8     mNewton         1       5.031e-002   1.123e-008
                L-CG-Descent    11      5.031e-002   3.928e-009
                L-BFGS          11      5.031e-002   3.928e-009
                CG-Descent 5.3  44211   5.031e-002   9.657e-007
palmer5c  6     mNewton         1       2.128e+000   1.447e-013
                L-CG-Descent    6       2.128e+000   3.749e-012
                L-BFGS          6       2.128e+000   3.749e-012
                CG-Descent 5.3  6       2.128e+000   2.629e-009
palmer6c  8     mNewton         1       1.639e-002   7.867e-010
                L-CG-Descent    11      1.639e-002   5.520e-009
                L-BFGS          11      1.639e-002   5.520e-009
                CG-Descent 5.3  14174   1.639e-002   7.738e-007
palmer7c  8     mNewton         1       6.020e-001   9.090e-009
                L-CG-Descent    11      6.020e-001   7.132e-009
                L-BFGS          11      6.020e-001   7.132e-009
                CG-Descent 5.3  65294   6.020e-001   9.957e-007
palmer8c  8     mNewton         1       1.598e-001   1.099e-009
                L-CG-Descent    11      1.598e-001   2.376e-009
                L-BFGS          11      1.598e-001   2.376e-009
                CG-Descent 5.3  8935    1.598e-001   9.394e-007
rosenbr   2     mNewton         20      2.754e-013   8.253e-007
                L-CG-Descent    34      4.691e-018   7.167e-008
                L-BFGS          34      4.691e-018   7.167e-008
                CG-Descent 5.3  37      1.004e-014   1.894e-007
sineval   2     mNewton         41      5.590e-033   2.069e-015
                L-CG-Descent    60      1.556e-023   1.817e-011
                L-BFGS          60      1.556e-023   1.817e-011
                CG-Descent 5.3  62      1.023e-012   5.575e-007
sisser    2     mNewton         15      3.814e-010   4.485e-007
                L-CG-Descent    6       6.830e-012   2.220e-008
                L-BFGS          6       6.830e-012   2.220e-008
                CG-Descent 5.3  6       3.026e-014   3.663e-010
tointqor  50    mNewton         1       1.176e+003   3.197e-014
                L-CG-Descent    29      1.175e+003


4.467e-007 






L-BFGS 


28 


1.175e-|-003 


7.482e-007 






CG-Descent 5.3 


29 


1.175e-|-003 


4.464e-007 


vardim 


200 


mNewtow 


19 


1.365e-025 


7.390e-011 






L-CG-Descent 


10 


4.168e-019 


2.582e-007 






L-BFGS 


7 


5.890e-025 


3.070e-010 



12 







CG-Descent 5.3 


10 


4.168e-019 


2.582e-007 


watsoii 


12 


mNewtow 






4.202C-006 


1.918C-009 






L-CG-Dcsccnt 




49 


1.592e-007 


8.026e-007 






L-BFGS 




48 


9.340e-008 


1.319e-007 






CG-Dcscent 5.3 




726 


1.139e-007 


8.115C-007 


yfitu 


2 


mNewtow 




37 


6.670e-013 


2.432e-012 






L-CG-Descent 




75 


8.074e-010 


3.910e-007 






L-BFGS 




75 


8.074e-010 


3.910e-007 






CG-Descent 5.3 




147 


2.969e-011 


5.681e-007 



We summarize the comparison of the test results as follows: 

1. The modified Newton function mNewton converges on all the test problems, terminating once the
condition $\|g(x_k)\|_\infty < 10^{-6}$ is met. For all problems except bard, cliff, deconvu, sisser, and vardim,
mNewton uses fewer iterations than L-CG-Descent, L-BFGS, and CG-Descent 5.3 to
converge to the minimum. Since mNewton needs more numerical operations in each iteration, a small
iteration count does not by itself imply superior efficiency, but it is an encouraging sign.

2. For all the problems except arglina, all algorithms find the same minimum. For the
problem arglina, the modified Newton algorithm finds a better local minimum.
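
The iteration-count comparison in item 1 can be spot-checked mechanically. The following is only a small sketch: it copies the iteration counts of a few rows from the table above (not the full test set), and the dictionary and helper names are hypothetical, chosen for illustration.

```python
# Iteration counts copied from a few rows of the table above, in the order
# (mNewton, L-CG-Descent, L-BFGS, CG-Descent 5.3). This is a subset only.
ITERATIONS = {
    "jensmp":   (10, 15, 15, 13),
    "kowosb":   (10, 17, 17, 66),
    "mancino":  (5, 11, 9, 11),
    "maratosb": (7, 1145, 1145, 946),
    "rosenbr":  (20, 34, 34, 37),
    "sisser":   (15, 6, 6, 6),
    "vardim":   (19, 10, 7, 10),
}


def mnewton_is_strictly_fastest(counts):
    """True if the first entry (mNewton) is below every other entry."""
    return counts[0] < min(counts[1:])


# Problems in this subset where mNewton does not use the fewest iterations.
exceptions = sorted(
    name for name, counts in ITERATIONS.items()
    if not mnewton_is_strictly_fastest(counts)
)
print(exceptions)  # ['sisser', 'vardim']
```

On this subset the only exceptions are sisser and vardim, which agrees with the list of exceptions given in item 1.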

Based on these test results, we believe that the new algorithm is promising. This has led us to apply
an idea similar to the one described in this paper to develop a modified BFGS algorithm, which is also
very promising in numerical tests.

5 Conclusions 

We have proposed a modified Newton algorithm and proved that it is globally and quadratically
convergent. We have shown that there is an efficient way to implement the proposed
algorithm, and we have presented some numerical test results. The results show that the proposed
algorithm is promising. The MATLAB implementation mNewton described in this paper is available from the author.
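
To make the central idea concrete, the following is a minimal sketch in Python (not the author's MATLAB code mNewton) of a Newton iteration whose Hessian is replaced by a convex combination of the identity and the Hessian, applied to the two-dimensional Rosenbrock function. Several pieces are assumptions made only for illustration: the rule used to pick the coefficient `gamma` (the smallest value that lifts the smallest eigenvalue of the combined matrix to `delta`), the Armijo backtracking line search, and all function and parameter names. The stopping test uses the infinity norm of the gradient with tolerance 1e-6.

```python
import math


def rosenbrock(x):
    """Return f(x), the gradient, and the Hessian entries (h11, h12, h22)."""
    x1, x2 = x
    r = x2 - x1 * x1
    f = 100.0 * r * r + (1.0 - x1) ** 2
    g = (-400.0 * x1 * r - 2.0 * (1.0 - x1), 200.0 * r)
    h = (1200.0 * x1 * x1 - 400.0 * x2 + 2.0, -400.0 * x1, 200.0)
    return f, g, h


def modified_newton(x, tol=1e-6, delta=1e-3, max_iter=200):
    """Newton iteration with B = gamma*I + (1 - gamma)*H (illustrative rule)."""
    for k in range(max_iter):
        f, g, (h11, h12, h22) = rosenbrock(x)
        if max(abs(g[0]), abs(g[1])) < tol:  # infinity-norm stopping test
            return x, k
        # Smallest eigenvalue of the symmetric 2x2 Hessian in closed form.
        lam_min = 0.5 * (h11 + h22) - math.hypot(0.5 * (h11 - h22), h12)
        # Assumed rule: gamma = 0 (pure Newton) if H is safely positive
        # definite, otherwise the gamma in (0, 1) giving lambda_min(B) = delta.
        gamma = 0.0 if lam_min >= delta else (delta - lam_min) / (1.0 - lam_min)
        a = gamma + (1.0 - gamma) * h11
        b = (1.0 - gamma) * h12
        c = gamma + (1.0 - gamma) * h22
        det = a * c - b * b
        # Search direction d = -B^{-1} g via the explicit 2x2 inverse.
        d = (-(c * g[0] - b * g[1]) / det, -(a * g[1] - b * g[0]) / det)
        slope = g[0] * d[0] + g[1] * d[1]
        # Armijo backtracking line search, at most 60 halvings.
        t = 1.0
        for _ in range(60):
            trial = (x[0] + t * d[0], x[1] + t * d[1])
            if rosenbrock(trial)[0] <= f + 1e-4 * t * slope:
                break
            t *= 0.5
        x = trial
    return x, max_iter


x_star, iters = modified_newton((-1.2, 1.0))
print(x_star, iters)
```

With `gamma = 0` the step is the pure Newton step and with `gamma = 1` it is the steepest descent step, so the combined matrix stays positive definite while its entries remain bounded by those of the identity and the Hessian.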